An instance of a random constraint satisfaction problem defines
a random subset $\mathcal{S}$ (the set of solutions) of a large product space
$\mathcal{X}^N$ (the set of assignments). We consider two prototypical problem
ensembles (random $k$-satisfiability and $q$-coloring of random
regular graphs), and study the uniform measure with support on
$\mathcal{S}$. As the number of constraints per variable increases, this
measure first decomposes into an exponential number of pure states
('clusters'), and subsequently condenses over the largest such
states. Above the condensation point, the mass carried by the $n$
largest states follows a Poisson-Dirichlet process.
For typical large instances, the two transitions are sharp. We
determine for the first time their precise location. Further, we provide
a formal definition of each phase transition in terms of different
notions of correlation between distinct variables in the problem.
The degree of correlation naturally affects the performance of
many search/sampling algorithms. Empirical evidence suggests
that local Monte Carlo Markov Chain strategies are effective up to
the clustering phase transition, and belief propagation up to the
condensation point. Finally, refined message passing techniques
(such as survey propagation) may beat this threshold as well.

Phase transitions | Random graphs | Constraint satisfaction problems | 
Message passing algorithms 

Constraint satisfaction problems (CSPs) arise in a large spectrum
of scientific disciplines. An instance of a CSP is said to be satisfiable
if there exists an assignment of $N$ variables $(x_1, x_2, \dots, x_N) \equiv x$,
$x_i \in \mathcal{X}$ ($\mathcal{X}$ being a finite alphabet), which satisfies all the
constraints within a given collection. The problem consists in finding
such an assignment or showing that the constraints are unsatisfiable.
More precisely, one is given a set of functions $\psi_a : \mathcal{X}^k \to \{0, 1\}$,
with $a \in \{1, \dots, M\} \equiv [M]$, and of $k$-tuples of indices
$\{i_a(1), \dots, i_a(k)\} \subseteq [N]$, and has to establish whether there exists
$x \in \mathcal{X}^N$ such that $\psi_a(x_{i_a(1)}, \dots, x_{i_a(k)}) = 1$ for all $a$'s. In this
article we shall consider two well known families of CSPs (both known
to be NP-complete [1]):

(i) $k$-satisfiability ($k$-SAT) with $k \ge 3$. In this case
$\mathcal{X} = \{0, 1\}$. The constraints are defined by fixing
a $k$-tuple $(z_a(1), \dots, z_a(k))$ for each $a$, and setting
$\psi_a(x_{i_a(1)}, \dots, x_{i_a(k)}) = 0$ if $(x_{i_a(1)}, \dots, x_{i_a(k)}) = (z_a(1), \dots, z_a(k))$
and $= 1$ otherwise.

(ii) $q$-coloring ($q$-COL) with $q \ge 3$. Given a graph $G$ with $N$
vertices and $M$ edges, one is asked to assign colors $x_i \in \mathcal{X} =
\{1, \dots, q\}$ to the vertices in such a way that no edge has the
same color at both ends.
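
The two constraint families above can be written down directly as indicator functions. The following is a minimal sketch; the data layout (a clause as a pair of index tuple and forbidden tuple, a graph as an edge list) and all function names are our own illustrative assumptions, not notation from the text.

```python
def psi_sat(z, x_local):
    """k-SAT constraint: 0 only if the local assignment equals the forbidden tuple z."""
    return 0 if tuple(x_local) == tuple(z) else 1


def psi_col(x_u, x_v):
    """Coloring constraint on one edge: 0 if both ends carry the same color."""
    return 0 if x_u == x_v else 1


def is_sat_solution(clauses, x):
    """clauses: list of (indices, z) pairs (hypothetical layout); x: list of 0/1 values."""
    return all(psi_sat(z, [x[i] for i in idx]) == 1 for idx, z in clauses)


def is_proper_coloring(edges, x):
    """edges: list of (u, v) pairs; x: list of colors, one per vertex."""
    return all(psi_col(x[u], x[v]) == 1 for u, v in edges)
```

An assignment is a solution exactly when every $\psi_a$ evaluates to 1, which is what the two `all(...)` checks express.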

The optimization (maximize the number of satisfied constraints)
and counting (count the number of satisfying assignments) versions
of these problems are defined straightforwardly. It is also convenient
to represent CSP instances as factor graphs [2], i.e. bipartite graphs
with vertex sets $[N]$, $[M]$ including an edge between node $i \in [N]$




Fig. 1. The factor graph of a small CSP allows one to define the distance $d(i, j)$
between variables $x_i$ and $x_j$ (filled squares are constraints and empty circles
variables). Here, for instance, $d(6, 1) = 2$ and $d(3, 5) = 1$.

and $a \in [M]$ if and only if the $i$-th variable is involved in the $a$-th
constraint, cf. Fig. 1. This representation allows one to define naturally a
distance $d(i,j)$ between variable nodes.
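
As an illustration of this distance, here is a small breadth-first search sketch. We assume, consistently with the values quoted in the caption of Fig. 1, that two variables sharing a constraint are at distance 1; the function name and the input layout (one tuple of variable indices per constraint) are hypothetical.

```python
from collections import deque


def variable_distance(constraints, i, j):
    """Number of constraint nodes on a shortest factor-graph path from variable i to j.

    constraints: list of tuples of variable indices (assumed layout).
    Returns None if i and j are not connected.
    """
    if i == j:
        return 0
    # Map each variable to the constraints it participates in.
    by_var = {}
    for a, members in enumerate(constraints):
        for v in members:
            by_var.setdefault(v, []).append(a)
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for a in by_var.get(u, []):
            for v in constraints[a]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    if v == j:
                        return dist[v]
                    queue.append(v)
    return None
```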

Ensembles of random CSPs (rCSP) were introduced (see e.g. [3])
with the hope of discovering generic mathematical phenomena that
could be exploited in the design of efficient algorithms. Indeed several
search heuristics, such as Walk-SAT [4] and 'myopic' algorithms [5],
have been successfully analyzed and optimized over rCSP ensembles.
The most spectacular advance in this direction has probably been the
introduction of a new and powerful message passing algorithm ('survey
propagation', SP) [6]. The original justification for SP was based
on the (non-rigorous) cavity method from spin glass theory. Subsequent
work proved that standard message passing algorithms (such as
belief propagation, BP) can indeed be useful for some CSPs [7, 8, 9].
Nevertheless, the fundamental reason for the (empirical) superiority
of SP in this context remains to be understood and is a major open
problem in the field. Building on a refined picture of the solution set of
rCSP, this paper provides a possible (and testable) explanation. We
consider two ensembles that have attracted the majority of work in the
field: (i) random $k$-SAT: each $k$-SAT instance with $N$ variables and
$M = N\alpha$ clauses is considered with the same probability; (ii) $q$-COL
on random graphs: the graph $G$ is uniformly random among the ones
over $N$ vertices, with uniform degree $l$ (the number of constraints is
therefore $M = Nl/2$).
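
For concreteness, a random $k$-SAT instance can be drawn as in the sketch below. This is an assumption-laden illustration: clauses are drawn independently, each on $k$ distinct variables, so repeated clauses are possible; for large $N$ this is a common stand-in for the uniform ensemble but it is not literally the same distribution. The function name and the clause layout are hypothetical.

```python
import random


def random_ksat(n, alpha, k, seed=None):
    """Draw M = round(alpha * n) clauses, each as (indices, forbidden_tuple)."""
    rng = random.Random(seed)
    m = int(round(alpha * n))
    clauses = []
    for _ in range(m):
        idx = tuple(rng.sample(range(n), k))            # k distinct variables
        z = tuple(rng.randint(0, 1) for _ in range(k))  # forbidden local assignment
        clauses.append((idx, z))
    return clauses
```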

Phase transitions in random CSP. It is well known that rCSPs
may undergo phase transitions as the number of constraints per variable
$\alpha$ is varied. The best known of such phase transitions is the
SAT-UNSAT one: as $\alpha$ crosses a critical value $\alpha_s(k)$ (that can, in
principle, depend on $N$), the instances pass from being satisfiable to
Table 1. Critical connectivities for the dynamical, condensation
and satisfiability transitions in $k$-SAT and $q$-COL.
unsatisfiable with high probability (whp) [10]. For $k$-SAT, it is known that
$\alpha_s(2) = 1$. A conjecture based on the cavity method was put forward
in [6] for all $k \ge 3$ that implied in particular the values presented in
Table 1 and $\alpha_s(k) = 2^k \log 2 - \frac{1}{2}(1 + \log 2) + O(2^{-k})$ for large $k$ [11].
Subsequently it was proved that $\alpha_s(k) \ge 2^k \log 2 - O(k)$, confirming
this asymptotic behavior [12]. An analogous conjecture for $q$-coloring
was proposed in [13] yielding, for regular random graphs [14], the values
reported in Table 1 and $l_s(q) = 2q \log q - \log q - 1 + o(1)$ for large
$q$ (according to our convention, random graphs are whp uncolorable if
$l \ge l_s(q)$). It was proved in [15, 12] that $l_s(q) = 2q \log q - O(\log q)$.

Even more interesting and challenging are phase transitions in the
structure of the set $\mathcal{S} \subseteq \mathcal{X}^N$ of solutions of rCSPs ('structural' phase
transitions). Assuming the existence of solutions, a convenient way
of describing $\mathcal{S}$ is to introduce the uniform measure over solutions
$\mu(x)$:

$$ \mu(x) = \frac{1}{Z} \prod_{a=1}^{M} \psi_a(x_{i_a(1)}, \dots, x_{i_a(k)}) , $$   [1]

where $Z \ge 1$ is the number of solutions. Let us stress that, since $\mathcal{S}$
depends on the rCSP instance, $\mu(\cdot)$ is itself random.
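
On very small instances the measure in Eq. [1] can be tabulated by exhaustive enumeration, which is useful for checking intuition. The sketch below assumes binary variables and the (indices, forbidden tuple) clause layout; both the layout and the function names are illustrative assumptions, and the approach is of course exponential in $N$.

```python
import itertools


def solutions(n, clauses):
    """All x in {0,1}^n with psi_a = 1 for every clause (brute force, small n only)."""
    sols = []
    for x in itertools.product((0, 1), repeat=n):
        if all(tuple(x[i] for i in idx) != tuple(z) for idx, z in clauses):
            sols.append(x)
    return sols


def single_variable_marginal(sols, i):
    """mu(x_i = 1) under the uniform measure over the listed solutions."""
    return sum(x[i] for x in sols) / len(sols)


# One clause on three variables forbids a single assignment, so Z = 7.
sols = solutions(3, [((0, 1, 2), (0, 0, 0))])
print(len(sols), single_variable_marginal(sols, 0))
```

Each solution has weight $1/Z$, and marginals are simple averages over the list.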

We shall now introduce a few possible 'global' characterizations
of the measure $\mu(\cdot)$. Each one of these properties has its counterpart
in the theory of Gibbs measures and we shall partially adopt that
terminology here [17].

In order to define the first of such characterizations, we let $i \in [N]$
be a uniformly random variable index, denote as $x_\ell$ the vector of variables
whose distance from $i$ is at least $\ell$, and by $\mu(x_i | x_\ell)$ the marginal
distribution of $x_i$ given $x_\ell$. Then we say that the measure $\mu(\cdot)$ satisfies
the uniqueness condition if, for any given $i \in [N]$,

$$ \mathbb{E} \sup_{x_\ell, x'_\ell} \sum_{x_i \in \mathcal{X}} \left| \mu(x_i | x_\ell) - \mu(x_i | x'_\ell) \right| \to 0 $$   [2]

as $\ell \to \infty$ (here and below the limit $N \to \infty$ is understood to be taken
before $\ell \to \infty$). This expresses a 'worst case' correlation decay condition.
Roughly speaking: the variable $x_i$ is (almost) independent of
the far apart variables $x_\ell$ irrespective of the instance realization and of the
variables' distribution outside the horizon of radius $\ell$. The threshold for
uniqueness (above which uniqueness ceases to hold) was estimated
in [9] for random $k$-SAT, yielding $\alpha_u(k) = (2 \log k)/k \, [1 + o(1)]$
(which is asymptotically close to the threshold for the pure literal
heuristics) and in [18] for coloring, implying $l_u(q) = q$ for $q$ large
enough (a 'numerical' proof of the same statement exists for small $q$).
Below such thresholds BP can be proved to return good estimates of
the local marginals of the distribution $\mu(\cdot)$.

Notice that the uniqueness threshold is far below the SAT-UNSAT
threshold. Furthermore, several empirical studies [19, 20] pointed out
that BP (as well as many other heuristics [4, 5]) is effective up to much
larger values of the clause density. In a remarkable series of papers
[21, 6], statistical physicists argued that a second structural phase
transition is more relevant than the uniqueness one. Following this
literature, we shall refer to this as the 'dynamic phase transition' (DPT)
and denote the corresponding threshold as $\alpha_d(k)$ (or $l_d(q)$). In order
to make this notion precise, we provide here two alternative formulations




Fig. 2. Pictorial representation of the different phase transitions in the set of
solutions of a rCSP (the thresholds $\alpha_{d,+}$, $\alpha_d$, $\alpha_c$, $\alpha_s$ are marked along
the $\alpha$ axis). At $\alpha_{d,+}$ some clusters appear, but for $\alpha_{d,+} < \alpha < \alpha_d$ they
comprise only an exponentially small fraction of solutions. For $\alpha_d < \alpha < \alpha_c$ the
solutions are split among about $e^{N\Sigma_*}$ clusters of size $e^{Ns_*}$. If $\alpha_c < \alpha < \alpha_s$
the set of solutions is dominated by a few large clusters (with strongly fluctuating
weights), and above $\alpha_s$ the problem does not admit solutions any more.

corresponding to two distinct intuitions. According to the first one,
above $\alpha_d(k)$ the variables $(x_1, \dots, x_N)$ become globally correlated
under $\mu(\cdot)$. The criterion in [2] is replaced by one in which far apart
variables $x_\ell$ are themselves sampled from $\mu$ ('extremality' condition):

$$ \mathbb{E} \sum_{x_\ell} \mu(x_\ell) \sum_{x_i} \left| \mu(x_i | x_\ell) - \mu(x_i) \right| \to 0 $$   [3]

as $\ell \to \infty$. The infimum value of $\alpha$ (respectively $l$) such that
this condition is no longer fulfilled is the threshold $\alpha_d(k)$ ($l_d(q)$).
Of course this criterion is weaker than the uniqueness one (hence
$\alpha_d(k) \ge \alpha_u(k)$).

According to the second intuition, above $\alpha_d(k)$, the measure
[1] decomposes into a large number of disconnected 'clusters'.
This means that there exists a partition $\{A_n\}_{n=1\dots\mathcal{N}}$ of $\mathcal{X}^N$
(depending on the instance) such that: (i) One cannot find $n$ such
that $\mu(A_n) \to 1$; (ii) Denoting by $\partial_\epsilon A$ the set of configurations
$x \in \mathcal{X}^N \setminus A$ whose Hamming distance from $A$ is at most $N\epsilon$, we have
$\mu(\partial_\epsilon A_n) / [\mu(A_n)(1 - \mu(A_n))] \to 0$ exponentially fast in $N$ for all $n$
and $\epsilon$ small enough. Notice that the measure $\mu$ can be decomposed as

$$ \mu(\cdot) = \sum_{n=1}^{\mathcal{N}} w_n \, \mu_n(\cdot) , $$   [4]

where $w_n \equiv \mu(A_n)$ and $\mu_n(\cdot) \equiv \mu(\cdot \,|\, A_n)$. We shall always refer
to $\{A_n\}$ as the 'finer' partition with these properties.
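
The decomposition in Eq. [4] can be mimicked on toy instances. The sketch below is only a finite-size proxy and not the asymptotic definition given above: we assume that two solutions belong to the same cluster when they are connected by a chain of single-variable flips through solutions, and we take $w_n$ to be the fraction of solutions in cluster $n$. Names and data layout are hypothetical.

```python
import itertools


def solutions(n, clauses):
    """Brute-force list of satisfying assignments (small n only)."""
    return [x for x in itertools.product((0, 1), repeat=n)
            if all(tuple(x[i] for i in idx) != tuple(z) for idx, z in clauses)]


def cluster_weights(sols):
    """Connected components under single flips; returns the weights w_n, largest first."""
    sol_set, seen, weights = set(sols), set(), []
    for s in sols:
        if s in seen:
            continue
        seen.add(s)
        stack, size = [s], 0
        while stack:
            x = stack.pop()
            size += 1
            for i in range(len(x)):
                y = x[:i] + (1 - x[i],) + x[i + 1:]   # flip variable i
                if y in sol_set and y not in seen:
                    seen.add(y)
                    stack.append(y)
        weights.append(size / len(sols))
    return sorted(weights, reverse=True)


# Clauses forcing x0 = x1 = x2: the two solutions 000 and 111 are isolated.
equal = [((0, 1), (0, 1)), ((0, 1), (1, 0)), ((1, 2), (0, 1)), ((1, 2), (1, 0))]
print(cluster_weights(solutions(3, equal)))
```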

The above ideas are obviously related to the performance of algorithms.
For instance, the correlation decay condition in [3] is likely to
be sufficient for approximate correctness of BP on random formulae.
Also, the existence of partitions as above implies exponential slowing
down in a large class of MCMC sampling algorithms (see footnote 3).
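
One way to set up such an MCMC sampler is to give weight $\epsilon$, rather than 0, to each violated constraint and to run single-variable heat-bath (Glauber) updates on the relaxed measure. The following is a minimal sketch for $k$-SAT under that assumption; the function names, the clause layout and the choice of random-scan updates are ours.

```python
import random


def n_violated(clauses, x):
    """Number of clauses whose variables all take their forbidden values."""
    return sum(1 for idx, z in clauses
               if all(x[i] == zi for i, zi in zip(idx, z)))


def glauber(n, clauses, eps, sweeps, seed=0):
    """Heat-bath dynamics for the measure proportional to eps ** (#violated clauses)."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    touching = [[] for _ in range(n)]          # clauses containing each variable
    for clause in clauses:
        for i in clause[0]:
            touching[i].append(clause)
    for _ in range(sweeps * n):
        i = rng.randrange(n)
        w = []
        for val in (0, 1):
            x[i] = val
            # Only clauses touching i change, so they give the conditional weight.
            w.append(eps ** n_violated(touching[i], x))
        x[i] = 1 if rng.random() * (w[0] + w[1]) < w[1] else 0
    return x
```

For small $\epsilon$ the chain rarely crosses regions of violated constraints, which is where the slowing down mentioned above would be expected to show up.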

Recently, some important rigorous results were obtained supporting
this picture [22, 23]. However, even at the heuristic level, several
crucial questions remain open. The most important concern the
distribution of the weights $\{w_n\}$: are they tightly concentrated (on an
appropriate scale) or not? A (somewhat surprisingly) related question
is: can the absence of decorrelation above $\alpha_d(k)$ be detected by
probing a subset of variables bounded in $N$?

SP [6] can be thought of as an inference algorithm for a modified
graphical model that gives unit weight to each cluster [24, 20], thus
tilting the original measure towards small clusters. The resulting
performance will strongly depend on the distribution of the cluster sizes
$w_n$. Further, under the tilted measure, $\alpha_d(k)$ is underestimated because
small clusters have a larger impact. The correct value was never
determined (but see [16] for coloring). The authors of [25] undertook



3 One possible approach to the definition of a MCMC algorithm is to relax the constraints by
setting $\psi_a(\cdots) = \epsilon$ instead of 0 whenever the $a$-th constraint is violated. Glauber dynamics
can then be used to sample from the relaxed measure $\mu_\epsilon(\cdot)$.
Fig. 3. The Parisi 1RSB parameter $m(\alpha)$ as a function of the constraint
density $\alpha$ (the values $\alpha_c(k)$ and $\alpha_s(k)$ are marked on the horizontal axis).
In the inset, the complexity $\Sigma(s)$ as a function of the cluster entropy for
$\alpha = \alpha_s(k) - 0.1$ (the slope at $\Sigma(s) = 0$ is $-m(\alpha)$). Both curves have been
computed from the large $k$ expansion.



the technically challenging task of determining the cluster size distri- 
bution, without however clarifying several of its properties. 

In this paper we address these issues, and unveil at least two un- 
expected phenomena. Our results are described in the next Section 
with a summary just below. Finally we will discuss the connection 
with the performances of SP. Some technical details of the calculation 
are collected in the last Section. 

Results and discussion 

The formulation in terms of the extremality condition, cf. Eq. [3], allows
for a heuristic calculation of the dynamic threshold $\alpha_d(k)$. Previous
attempts were based instead on the cavity method, that is a heuristic
implementation of the definition in terms of pure state decomposition,
cf. Eq. [4]. Generalizing the results of [16], it is possible to show that
the two calculations provide identical results. However, the first one
is technically simpler and under much better control. As mentioned
above we obtain, for all $k \ge 4$, a value of $\alpha_d(k)$ larger than the one
quoted in [6, 11].

Further we determined the distribution of cluster sizes $w_n$, thus
unveiling a third 'condensation' phase transition at $\alpha_c(k) \ge \alpha_d(k)$
(strict inequality holds for $k \ge 4$ in SAT and $q \ge 4$ in coloring, see
below). For $\alpha < \alpha_c(k)$ the weights $w_n$ concentrate on a logarithmic
scale (namely $-\log w_n$ is $\Theta(N)$ with $\Theta(N^{1/2})$ fluctuations).
Roughly speaking the measure is evenly split among an exponential
number of clusters.

For $\alpha > \alpha_c(k)$ (and $< \alpha_s(k)$) the measure is carried by a subexponential
number of clusters. More precisely, the ordered sequence
$\{w_n\}$ converges to a well known Poisson-Dirichlet process $\{w_n^*\}$, first
recognized in the spin glass context by Ruelle [26]. This is defined by
$w_n^* = x_n / \sum_n x_n$, where $x_n > 0$ are the points of a Poisson process
with rate $x^{-1-m(\alpha)}$ and $m(\alpha) \in (0, 1)$. This picture is known in spin
glass theory as 'one step replica symmetry breaking' (1RSB) and has
been proven in Ref. [27] for some special models. The 'Parisi 1RSB
parameter' $m(\alpha)$ is monotonically decreasing from 1 to 0 when $\alpha$
increases from $\alpha_c(k)$ to $\alpha_s(k)$, cf. Fig. 3.
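
The Poisson-Dirichlet weights just described are easy to simulate approximately. In the sketch below we use the fact that the number of points of the process above $x$ is Poisson with mean $\int_x^\infty y^{-1-m} dy = x^{-m}/m$, so the $n$-th largest point can be written as $(m\,\Gamma_n)^{-1/m}$, with $\Gamma_n$ the $n$-th arrival time of a unit-rate Poisson process. Keeping only the `n_points` largest points is a truncation we introduce for illustration; the function name is hypothetical.

```python
import random


def poisson_dirichlet_weights(m, n_points, seed=None):
    """Approximate sample of the ordered weights w*_n for parameter 0 < m < 1."""
    if not 0.0 < m < 1.0:
        raise ValueError("m must lie strictly between 0 and 1")
    rng = random.Random(seed)
    gamma = 0.0
    points = []
    for _ in range(n_points):
        gamma += rng.expovariate(1.0)              # next arrival time Gamma_n
        points.append((m * gamma) ** (-1.0 / m))   # n-th largest point x_n
    total = sum(points)
    return [x / total for x in points]


w = poisson_dirichlet_weights(0.5, 1000, seed=1)
print(w[:5])   # the few largest weights typically carry most of the mass
```

For very small `m` the power can overflow a float; the sketch makes no attempt to guard against that.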

Remarkably the condensation phase transition is also linked to
an appropriate notion of correlation decay. If $i(1), \dots, i(n) \in [N]$
are uniformly random variable indices, then, for $\alpha < \alpha_c(k)$ and any
fixed $n$:

$$ \mathbb{E} \sum_{\{x_{i(\cdot)}\}} \left| \mu(x_{i(1)}, \dots, x_{i(n)}) - \mu(x_{i(1)}) \cdots \mu(x_{i(n)}) \right| \to 0 $$   [5]

as $N \to \infty$. Conversely, the quantity on the left hand side remains
positive for $\alpha > \alpha_c(k)$. It is easy to understand that this condition is
even weaker than the extremality one, cf. Eq. [3], in that we probe
correlations of finite subsets of the variables. In the next two Sections
we discuss the calculation of $\alpha_d$ and $\alpha_c$.

Dynamic phase transition and Gibbs measure extremality. 

A rigorous calculation of $\alpha_d(k)$ along either of the two definitions
provided above, cf. Eqs. [3] and [4], remains an open problem. Each of
the two approaches has however a heuristic implementation that we
shall now describe. It can be proved that the two calculations yield
equal results as further discussed in the last Section of the paper.

The approach based on the extremality condition in [3] relies on
an easy-to-state assumption, and typically provides a more precise
estimate. We begin by observing that, due to the Markov structure of
$\mu(\cdot)$, it is sufficient for Eq. [3] to hold that the same condition is
verified by the correlation between $x_i$ and the set of variables at distance
exactly $\ell$ from $i$, that we shall keep denoting as $x_\ell$. The idea is then to
consider a large yet finite neighborhood of $i$. Given $\bar\ell \ge \ell$, the factor
graph neighborhood of radius $\bar\ell$ around $i$ converges in distribution to
the radius-$\bar\ell$ neighborhood of the root in a well defined random tree
factor graph $T$.

For coloring of random regular graphs, the correct limiting tree
model $T$ is coloring on the infinite $l$-regular tree. For random $k$-SAT,
$T$ is defined by the following construction. Start from the root variable
node and connect it to $l$ new function nodes (clauses), $l$ being a
Poisson random variable of mean $k\alpha$. Connect each of these function
nodes with $k - 1$ new variables and repeat. The resulting tree is infinite
with non-vanishing probability if $\alpha > 1/[k(k - 1)]$. Associate a
formula to this graph in the usual way, with each variable occurrence
being negated independently with probability 1/2.
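
The tree construction for random $k$-SAT can be sketched as follows. This is an illustration under our own conventions: the tree is truncated after a given number of generations, clauses are stored as (indices, forbidden tuple) pairs so that a uniformly random forbidden value plays the role of the random negation, and the Poisson sampler is the simple multiplication method, adequate for moderate means only. All names are hypothetical.

```python
import math
import random


def poisson(mean, rng):
    """Poisson sample by multiplying uniforms (fine for moderate means)."""
    limit, n, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        n += 1
        p *= rng.random()
    return n


def ksat_tree(k, alpha, depth, seed=None):
    """Random k-SAT tree formula rooted at variable 0, truncated at `depth` generations.

    Returns (number of variables, clauses, variables of the last generation).
    """
    rng = random.Random(seed)
    clauses, n_vars, frontier = [], 1, [0]
    for _generation in range(depth):
        next_frontier = []
        for v in frontier:
            for _clause in range(poisson(k * alpha, rng)):
                children = list(range(n_vars, n_vars + k - 1))   # k-1 new variables
                n_vars += k - 1
                z = tuple(rng.randint(0, 1) for _ in range(k))   # random negations
                clauses.append(((v, *children), z))
                next_frontier.extend(children)
        frontier = next_frontier
    return n_vars, clauses, frontier
```

The expected number of variables grows like $[k(k-1)\alpha]^{\text{depth}}$, so only shallow trees are practical near the thresholds discussed here.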

The basic assumption within the first approach is that the extremality
condition in [3] can be checked on the correlation between
the root and generation-$\ell$ variables in the tree model. On the tree,
$\mu(\cdot)$ is defined to be a translation invariant Gibbs measure [17] associated
to the infinite factor graph $T$ (see footnote 4), which provides a specification.
The correlation between the root and generation-$\ell$ variables can be
computed through a recursive procedure (defining a sequence of distributions
$P_\ell$, see Eq. [15] below). The recursion can be efficiently
implemented numerically yielding the values presented in Table 1 for
$k$ (resp. $q$) $= 4, 5, 6$. For large $k$ (resp. $q$) one can formally expand
the equations on $P_\ell$ and obtain



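$$ \alpha_d(k) = \frac{2^k}{k} \left[ \log k + \log\log k + \gamma_d + O\!\left( \frac{\log\log k}{\log k} \right) \right] , $$   [6]

$$ l_d(q) = q \left[ \log q + \log\log q + \gamma_d + o(1) \right] , $$   [7]

with $\gamma_d = 1$ (under a technical assumption on the structure of $P_\ell$).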

The second approach to the determination of $\alpha_d(k)$ is based on
the 'cavity method' [6, 25]. It begins by assuming a decomposition in
pure states of the form [4] with two crucial properties: (i) If we denote
by $W_n$ the size of the $n$-th cluster (and hence $w_n = W_n / \sum_n W_n$),
then the number of clusters of size $W_n = e^{Ns}$ grows approximately
as $e^{N\Sigma(s)}$; (ii) For each single-cluster measure $\mu_n(\cdot)$, a correlation
decay condition of the form [3] holds.

The approach aims at determining the rate function $\Sigma(s)$ (the
'complexity'): the result is expressed in terms of the solution of a
distributional fixed point equation. For the sake of simplicity we describe
here the simplest possible scenario (see footnote 5) resulting from such a
calculation, cf. Fig. 4. For $\alpha < \alpha_{d,-}(k)$ the cavity fixed point equation
does not admit any solution: no clusters are present. At $\alpha_{d,-}(k)$ a
solution appears, eventually yielding, for $\alpha > \alpha_{d,+}$, a non-negative



4 More precisely $\mu(\cdot)$ is obtained as a limit of free boundary measures (further details in [28]).
5 The precise picture depends on the value of $k$ (resp. $q$) and can be somewhat more compli-
Fig. 4. The complexity function (the number of clusters with entropy density
$s$ is $e^{N\Sigma(s)}$) for the 6-colorings of $l$-regular graphs with $l \in \{17, 18, 19, 20\}$.
Circles indicate the dominating states with entropy $s_*$; the dashed lines have
slopes $\Sigma'(s_*) = -1$ for $l = 18$ and $\Sigma'(s_*) = -0.92$ for $l = 19$. The
dynamic phase transition is $l_d(6) = 18$, the condensation one $l_c(6) = 19$, and
the SAT-UNSAT one $l_s(6) = 20$.
Fig. 5. Correlation function [3] between the root and generation $\ell$ variables in
a random $k$-SAT tree formula. Here $k = 4$ and (from bottom to top) $\alpha = 9.30$,
9.33, 9.35, 9.40 (recall that $\alpha_d(4) \approx 9.38$). In the inset, the complexity $\Sigma(s_*)$
of dominant clusters as a function of $\alpha$ for 4-SAT.

complexity $\Sigma(s)$ for some values of $s \in \mathbb{R}_+$. The maximum and
minimum such values will be denoted by $s_{\max}$ and $s_{\min}$. At a strictly
larger value $\alpha_{d,0}(k)$, $\Sigma(s)$ develops a stationary point (local maximum).
It turns out that $\alpha_{d,0}(k)$ coincides with the threshold computed
in [6, 11, 14]. In particular $\alpha_{d,0}(4) \approx 8.297$, $\alpha_{d,0}(5) \approx 16.12$,
$\alpha_{d,0}(6) \approx 30.50$ and $l_{d,0}(4) = 9$, $l_{d,0}(5) = 13$, $l_{d,0}(6) = 17$. For
large $k$ (resp. $q$), $\alpha_{d,0}(k)$ admits the same expansion as in Eqs. [6],
[7] with $\gamma_{d,0} = 1 - \log 2$. However, up to the larger value $\alpha_d(k)$, the
appearance of clusters is irrelevant from the point of view of $\mu(\cdot)$. In
fact, within the cavity method it can be shown that $e^{N[s + \Sigma(s)]}$ remains
exponentially smaller than the total number of solutions $Z$: most of
the solutions are in a single 'cluster'. The value $\alpha_d(k)$ is determined
by the appearance of a point $s_*$ with $\Sigma'(s_*) = -1$ on the complexity
curve. Correspondingly, one has $Z \doteq e^{N[\Sigma(s_*) + s_*]}$: most of the
solutions are comprised in clusters of size about $e^{Ns_*}$. The entropy per
variable $\phi = \lim_{N\to\infty} N^{-1} \log Z$ remains analytic at $\alpha_d(k)$.

Condensation phase transition. As $\alpha$ increases above $\alpha_d$, $\Sigma(s_*)$
decreases: clusters of highly correlated solutions may no longer satisfy
the newly added constraints. In the inset of Fig. 5 we show
the $\alpha$ dependency of $\Sigma(s_*)$ for 4-SAT. In the large $k$ limit, with
$\alpha = \rho 2^k$, we get $\Sigma(s_*) = \log 2 - \rho - \log 2 \, e^{-k\rho} + O(2^{-k})$, and
$s_* = \log 2 \, e^{-k\rho} + O(2^{-k})$.

Fig. 6. Performance of BP heuristics on random 4-SAT formulae. The residual
entropy per spin $N^{-1} \log Z$ (here we estimate it within Bethe approximation) as
a function of the fraction of fixed variables. $t_{\max} = 20$ in these experiments.

The condensation point $\alpha_c(k)$ is the value of $\alpha$ such that $\Sigma(s_*)$
vanishes: above $\alpha_c(k)$, most of the measure is contained in a subexponential
number of large clusters (see footnote 6). Our estimates for $\alpha_c(k)$ are
presented in Table 1 (see also Fig. 4 for $\Sigma(s)$ in the 6-coloring) while
in the large-$k$ limit we obtain $\alpha_c(k) = 2^k \log 2 - \frac{3}{2} \log 2 + O(2^{-k})$
[recall that the SAT-UNSAT transition is at $\alpha_s(k) = 2^k \log 2 -
\frac{1 + \log 2}{2} + O(2^{-k})$] and $l_c(q) = 2q \log q - \log q - 2 \log 2 + o(1)$ [with
the COL-UNCOL transition at $l_s(q) = 2q \log q - \log q - 1 + o(1)$].
Technically the size of dominating clusters is found by maximizing
$\Sigma(s) + s$ over the $s$ interval on which $\Sigma(s) \ge 0$. For $\alpha \in
[\alpha_c(k), \alpha_s(k)]$, the maximum is reached at $s_{\max}$, with $\Sigma(s_{\max}) = 0$,
yielding $\phi = s_{\max}$. It turns out that the solutions are comprised
within a finite number of clusters, of size $e^{N s_{\max} + \Delta}$, where
$\Delta = \Theta(1)$. The shifts $\Delta$ are asymptotically distributed according to
a Poisson point process of rate $e^{-m(\alpha)\Delta}$ with $m(\alpha) = -\Sigma'(s_{\max})$.
This leads to the Poisson-Dirichlet distribution of weights discussed
above. Finally, the entropy per variable $\phi$ is non-analytic at $\alpha_c(k)$.
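
The maximization of $\Sigma(s) + s$ over the region $\Sigma(s) \ge 0$ can be made concrete on a toy curve. The sketch below assumes a purely hypothetical concave parabola $\Sigma(s) = c - (s - s_0)^2 / (2v)$; the parameters $c$, $s_0$, $v$ and the function name are illustrative and are not taken from the cavity computation.

```python
import math


def dominant_clusters(c, s0, v):
    """Maximize Sigma(s) + s subject to Sigma(s) >= 0 for the toy parabola above.

    Returns the dominant entropy s, its complexity, the parameter m = -Sigma'(s)
    (capped at 1), the total entropy phi, and whether the measure is condensed.
    """
    def sigma(s):
        return c - (s - s0) ** 2 / (2.0 * v)

    s_star = s0 + v                      # unconstrained optimum: Sigma'(s_star) = -1
    if sigma(s_star) >= 0.0:
        return {"s": s_star, "sigma": sigma(s_star), "m": 1.0,
                "phi": s_star + sigma(s_star), "condensed": False}
    s_max = s0 + math.sqrt(2.0 * v * c)  # largest zero of Sigma
    return {"s": s_max, "sigma": 0.0, "m": (s_max - s0) / v,
            "phi": s_max, "condensed": True}


print(dominant_clusters(0.10, 0.05, 0.1))   # Sigma(s_*) > 0: exponentially many clusters
print(dominant_clusters(0.02, 0.05, 0.1))   # Sigma(s_*) < 0: condensed, 0 < m < 1
```

Lowering $c$ in this toy model plays the role of increasing $\alpha$: once $\Sigma(s_*)$ would become negative, the maximum sticks to the boundary $\Sigma = 0$ and the slope there gives $m < 1$.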

Let us conclude by stressing two points. First, we avoided the
3-SAT and 3-coloring cases. These cases (as well as the 3-coloring
on Erdős-Rényi graphs [25]) are particular in that the dynamic transition
point $\alpha_d$ is determined by a local instability (a Kesten-Stigum
[29] condition, see also [21]), yielding $\alpha_d(3) \approx 3.86$ and $l_d(3) = 6$
(the case $l = 5$, $q = 3$ being marginal). Related to this is the fact that
$\alpha_c = \alpha_d$: throughout the clustered phase, the measure is dominated
by a few large clusters (technically, $\Sigma(s_*) < 0$ for all $\alpha > \alpha_d$). Second,
we did not check the 'local stability' of the 1RSB calculation.
By analogy with [30], we expect that an instability can modify the
curve $\Sigma(s)$ but not the values of $\alpha_d$ and $\alpha_c$.

Algorithmic implications. Two message passing algorithms were
studied extensively on random $k$-SAT: belief propagation (BP) and
survey propagation (SP) (mixed strategies were also considered in
[19, 20]). A BP message $\nu_{u \to v}(x)$ between nodes $u$ and $v$ on the
factor graph is usually interpreted as the marginal distribution of $x_u$
(or $x_v$) in a modified graphical model. An SP message is instead a
distribution over such marginals $P_{u \to v}(\nu)$. The empirical superiority
of SP is usually attributed to the existence of clusters [6]: the distribution
$P_{u \to v}(\nu)$ is a 'survey' of the marginal distribution of $x_u$ over
the clusters. As a consequence, according to the standard wisdom, SP
should outperform BP for $\alpha \ge \alpha_d(k)$.

This picture has however several problems. Let us list two of 
them. First, it seems that essentially local algorithms (such as mes- 
sage passing ones) should be sensitive only to correlations among 



6 Notice that for $q$-coloring, since $l$ is an integer, the 'condensed' regime $[l_c(q), l_s(q)]$ may be
empty: this is the case for $q = 4$. On the contrary, $q = 5$ is always condensed for $l_d < l < l_s$.
7 This paradox was noticed independently by Dimitris Achlioptas (personal communication). 
finite subsets of the variables (see footnote 7), and these remain bounded up to the
condensation transition. Recall in fact that the extremality condition
in [3] involves a number of variables unbounded in $N$, while the
weaker condition in [5] is satisfied up to $\alpha_c(k)$.

Secondly, it would be meaningful to weight uniformly the solutions
when computing the surveys $P_{u \to v}(\nu)$. In the cavity method
jargon, this corresponds to using a 1RSB Parisi parameter $r = 1$
instead of $r = 0$ as is done in [6]. It is a simple algebraic fact of
the cavity formalism that for $r = 1$ the means of the SP surveys
satisfy the BP equations. Since the means are the most important
statistics used by SP to find a solution, BP should perform roughly
as SP. Both arguments suggest that BP should perform well up to the
condensation point $\alpha_c(k)$. We tested this conclusion on 4-SAT at
$\alpha = 9.5 \in (\alpha_d(4), \alpha_c(4))$, through the following numerical experiment,
cf. Fig. 6. (i) Run BP for $t_{\max}$ iterations. (ii) Compute the BP
estimates $\nu_i(x)$ for the single bit marginals and choose the one with
largest bias. (iii) Fix $x_i = 0$ or 1 with probabilities $\nu_i(0)$, $\nu_i(1)$.
(iv) Reduce the formula accordingly (i.e. eliminate the constraints
satisfied by the assignment of $x_i$ and reduce the ones violated). This
cycle is repeated until a solution is found or a contradiction is
encountered. If the marginals $\nu_i(\cdot)$ were correct, this procedure would
provide a satisfying assignment sampled uniformly from $\mu(\cdot)$. In
fact we found a solution with finite probability (roughly 0.4), despite
the fact that $\alpha > \alpha_d(4)$. The experiment was repeated at $\alpha = 9$ with
a similar fraction of successes (more data on the success probability
will be reported in [31]).
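
Steps (i)-(iv) can be sketched in a few dozen lines. The code below is our own minimal illustration and not the implementation used for Fig. 6: messages are updated sequentially rather than in parallel, a clause is stored as a list of (variable, forbidden value) pairs, and all function names and defaults are hypothetical. It is meant for small formulae only.

```python
import random


def random_ksat(n, m, k, rng):
    """m clauses, each a list of (variable, forbidden value) pairs."""
    return [[(v, rng.randint(0, 1)) for v in rng.sample(range(n), k)] for _ in range(m)]


def bp_marginals(clauses, free_vars, t_max, rng):
    """Return BP estimates of mu(x_i = 1) for every free variable."""
    # eta[(a, i)]: probability that x_i takes clause a's forbidden value, without a.
    eta = {(a, i): rng.random() for a, cl in enumerate(clauses) for i, _ in cl}
    occ = {i: [] for i in free_vars}
    for a, cl in enumerate(clauses):
        for i, z in cl:
            occ[i].append((a, z))

    def warning(a, i):
        # Probability that all other variables of clause a take their forbidden value.
        p = 1.0
        for j, _ in clauses[a]:
            if j != i:
                p *= eta[(a, j)]
        return p

    def weights(i, skip):
        w = [1.0, 1.0]
        for b, z in occ[i]:
            if b != skip:
                w[z] *= 1.0 - warning(b, i)
        return w

    for _ in range(t_max):                       # step (i)
        for a, cl in enumerate(clauses):
            for i, z in cl:
                w = weights(i, a)
                s = w[0] + w[1]
                eta[(a, i)] = w[z] / s if s > 0 else 0.5
    marg = {}
    for i in free_vars:                          # step (ii), single-bit estimates
        w = weights(i, None)
        s = w[0] + w[1]
        marg[i] = w[1] / s if s > 0 else 0.5
    return marg


def bp_decimation(n, clauses, t_max=20, seed=0):
    """Return a satisfying assignment (dict) or None if a contradiction is met."""
    rng = random.Random(seed)
    clauses = [list(cl) for cl in clauses]
    free, assignment = set(range(n)), {}
    while free:
        marg = bp_marginals(clauses, free, t_max, rng)
        i = max(sorted(free), key=lambda v: abs(marg[v] - 0.5))   # largest bias
        x = 1 if rng.random() < marg[i] else 0                    # step (iii)
        assignment[i] = x
        free.remove(i)
        reduced = []                                              # step (iv)
        for cl in clauses:
            if any(v == i and z != x for v, z in cl):
                continue                     # clause satisfied: drop it
            rest = [(v, z) for v, z in cl if v != i]
            if not rest:
                return None                  # empty clause: contradiction
            reduced.append(rest)
        clauses = reduced
    return assignment
```

A call such as `bp_decimation(n, random_ksat(n, m, 4, random.Random(1)))` runs one trial; repeating over seeds gives an empirical success frequency, which for small `n` need not resemble the large-$N$ figure quoted above.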

Above the condensation transition, correlations become too strong
and the BP fixed point no longer describes the measure $\mu$. Indeed the
same algorithm proved unsuccessful at $\alpha = 9.7 \in (\alpha_c(4), \alpha_s(4))$.
As mentioned above, SP can be regarded as an inference algorithm in
a modified graphical model that weights preferentially small clusters.
More precisely, it selects clusters of size $e^{Ns}$ with $s$ maximizing the
complexity $\Sigma(s)$. With respect to the new measure, the weak correlation
condition in [5] still holds and allows inference to be performed by
message passing.

Within the cavity formalism, the optimal choice would be to take
$r \approx m(\alpha) \in [0, 1)$. Any parameter corresponding to a non-negative
complexity, $r \in [0, m(\alpha)]$, should however give good results. SP
corresponds to the choice $r = 0$, which has some definite computational
advantages, since messages have a compact representation in this case
(they are real numbers).

Cavity formalism, tree reconstruction and SP 

This Section provides some technical elements of our computation.
The reader not familiar with this topic is invited to further consult
Refs. [6, 11, 25, 32] for a more extensive introduction. The expert
reader will find a new derivation, and some hints of how we overcame
technical difficulties. A detailed account shall be given in [31, 33].

On a tree factor graph, the marginals of $\mu(\cdot)$, Eq. [1], can be computed
recursively. The edge of the factor graph from variable node $i$
to constraint node $a$ (respectively from $a$ to $i$) carries a 'message' $\eta_{i \to a}$
($\nu_{a \to i}$), a probability measure on $\mathcal{X}$ defined as the marginal of $x_i$ in
the modified graphical model obtained by deleting constraint node $a$
(resp. all constraint nodes around $i$ apart from $a$). The messages are
determined by the equations

$$ \eta_{i \to a}(x_i) = \frac{1}{z_{i \to a}} \prod_{b \in \partial i \setminus a} \nu_{b \to i}(x_i) , $$   [8]

$$ \nu_{a \to i}(x_i) = \frac{1}{z_{a \to i}} \sum_{x_{\partial a \setminus i}} \psi_a(x_{\partial a}) \prod_{j \in \partial a \setminus i} \eta_{j \to a}(x_j) , $$   [9]

where $\partial u$ is the set of nodes adjacent to $u$, $\setminus$ denotes the set subtraction
operation, and $x_A = \{x_j : j \in A\}$. These are just the BP
equations for the model [1]. The constants $z_{i \to a}$, $z_{a \to i}$ are uniquely
determined from the normalization conditions $\sum_{x_i} \eta_{i \to a}(x_i) =
\sum_{x_i} \nu_{a \to i}(x_i) = 1$. In the following we refer to these equations
by introducing functions $f_{i \to a}(\cdot)$, $f_{a \to i}(\cdot)$ such that

$$ \eta_{i \to a} = f_{i \to a}(\{\nu_{b \to i}\}_{b \in \partial i \setminus a}) , \qquad \nu_{a \to i} = f_{a \to i}(\{\eta_{j \to a}\}_{j \in \partial a \setminus i}) . $$   [10]

The marginals of $\mu$ are then computed from the solution of these equations.
For instance $\mu(x_i)$ is a function of the messages $\nu_{a \to i}$ from
neighboring function nodes.

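The log-number of solutions, $\log Z$, can be expressed as a sum
of contributions which are local functions of the messages that solve
Eqs. [8], [9]:

$$ \log Z = \sum_a \log z_a(\{\eta_{i \to a}\}) + \sum_i \log z_i(\{\nu_{a \to i}\}) - \sum_{(ai)} \log z_{ai}(\eta_{i \to a}, \nu_{a \to i}) , $$   [11]

where the last sum is over undirected edges in the factor graph and

$$ z_a \equiv \sum_{x_{\partial a}} \psi_a(x_{\partial a}) \prod_{i \in \partial a} \eta_{i \to a}(x_i) , \qquad z_i \equiv \sum_{x_i} \prod_{a \in \partial i} \nu_{a \to i}(x_i) , \qquad z_{ai} \equiv \sum_{x_i} \eta_{i \to a}(x_i) \, \nu_{a \to i}(x_i) . $$

Each term $z$ gives the change in the number of solutions when merging
different subtrees (for instance $\log z_i$ is the change in entropy when
the subtrees around $i$ are glued together). This expression coincides
with the Bethe free-energy [34] as expressed in terms of messages.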
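
As a sanity check of the message passing recursion and of the Bethe expression for $\log Z$, one can run both on a small tree factor graph and compare with exhaustive enumeration. The sketch below is a generic, unoptimized illustration; the data layout (a factor as a pair of variable tuple and indicator function), the sweep schedule and all names are our own assumptions.

```python
import itertools
import math


def normalize(w):
    total = sum(w)
    return [v / total for v in w]


def bp_on_tree(n, q, factors, sweeps=20):
    """factors: list of (vars, psi), psi mapping a tuple of values to 0/1 (or bool)."""
    nbrs = [[a for a, (vs, _) in enumerate(factors) if i in vs] for i in range(n)]
    eta = {(i, a): [1.0 / q] * q for a, (vs, _) in enumerate(factors) for i in vs}
    nu = {(a, i): [1.0 / q] * q for a, (vs, _) in enumerate(factors) for i in vs}
    for _ in range(sweeps):
        for a, (vs, psi) in enumerate(factors):       # constraint-to-variable update
            for pos, i in enumerate(vs):
                w = [0.0] * q
                for xs in itertools.product(range(q), repeat=len(vs)):
                    if psi(xs):
                        w[xs[pos]] += math.prod(
                            eta[(j, a)][xs[p]] for p, j in enumerate(vs) if j != i)
                nu[(a, i)] = normalize(w)
        for i in range(n):                            # variable-to-constraint update
            for a in nbrs[i]:
                eta[(i, a)] = normalize(
                    [math.prod(nu[(b, i)][x] for b in nbrs[i] if b != a)
                     for x in range(q)])
    return eta, nu, nbrs


def bethe_log_z(n, q, factors, eta, nu, nbrs):
    """Sum of log z_a and log z_i minus log z_ai over edges."""
    total = 0.0
    for a, (vs, psi) in enumerate(factors):
        total += math.log(sum(
            math.prod(eta[(j, a)][xs[p]] for p, j in enumerate(vs))
            for xs in itertools.product(range(q), repeat=len(vs)) if psi(xs)))
        for i in vs:
            total -= math.log(sum(eta[(i, a)][x] * nu[(a, i)][x] for x in range(q)))
    for i in range(n):
        total += math.log(sum(
            math.prod(nu[(b, i)][x] for b in nbrs[i]) for x in range(q)))
    return total


def brute_force_count(n, q, factors):
    return sum(1 for x in itertools.product(range(q), repeat=n)
               if all(psi(tuple(x[i] for i in vs)) for vs, psi in factors))


# A small tree: three constraints on five binary variables.
factors = [((0, 1), lambda xs: xs != (0, 0)),
           ((1, 2), lambda xs: xs != (1, 0)),
           ((2, 3, 4), lambda xs: xs != (0, 0, 0))]
eta, nu, nbrs = bp_on_tree(5, 2, factors)
print(math.exp(bethe_log_z(5, 2, factors, eta, nu, nbrs)),
      brute_force_count(5, 2, factors))
```

On a tree the two numbers agree up to rounding; on a graph with loops the same code would run but the Bethe value would only be an approximation.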

In order to move from trees to loopy graphs, we first consider an
intermediate step in which the factor graph is still a tree but a subset of
the variables, $x_B = \{x_j : j \in B\}$, is fixed. We are therefore replacing
the measure $\mu(\cdot)$, cf. Eq. [1], with the conditional one $\mu(\cdot \,|\, x_B)$. In
physics terms, the variables in $B$ specify a boundary condition.

Notice that the measure $\mu(\cdot \,|\, x_B)$ still factorizes according to (a
subgraph of) the original factor graph. As a consequence, the conditional
marginals $\mu(x_i | x_B)$ can be computed along the same lines as
above. The messages $\eta^{x_B}_{i \to a}$ and $\nu^{x_B}_{a \to i}$ obey Eqs. [10], with an
appropriate boundary condition for messages from $B$. Also, the number of
solutions that take values $x_B$ on $j \in B$ (call it $Z(x_B)$) can be computed
using Eq. [11].

Next, we want to consider the boundary variables themselves as
random variables. More precisely, given $r \in \mathbb{R}$, we let the boundary
be $x_B$ with probability

$$ \tilde\mu(x_B) = \frac{Z(x_B)^r}{Z(r)} , $$   [12]

where $Z(r)$ enforces the normalization $\sum_{x_B} \tilde\mu(x_B) = 1$. Define
$P_{i \to a}(\eta)$ as the probability density of $\eta^{x_B}_{i \to a}$ when $x_B$ is drawn from
$\tilde\mu$, and similarly $Q_{a \to i}(\nu)$. One can show that Eq. [8] implies the
following relation between message distributions

$$ P_{i \to a}(\eta) = \frac{1}{\mathcal{Z}_{i \to a}} \int \prod_{b \in \partial i \setminus a} dQ_{b \to i}(\nu_b) \; \delta\left[ \eta - f_{i \to a}(\{\nu_b\}) \right] \; z_{i \to a}(\{\nu_b\})^r , $$   [13]

where $f_{i \to a}$ is the function defined in Eq. [10], $z_{i \to a}$ is determined
by Eq. [8], and $\mathcal{Z}_{i \to a}$ is a normalization. A similar equation holds
for $Q_{a \to i}(\nu)$. These coincide with the '1RSB equations' with Parisi
parameter $r$. Survey propagation (SP) corresponds to a particular
parameterization of Eq. [13] (and the analogous one expressing $Q_{a \to i}$
in terms of the $P$'s) valid for $r = 0$.
The log-partition function $\Phi(r) = \log Z(r)$ admits an expression
that is analogous to Eq. [11],

$$\log Z(r) = \sum_a \log Z_a(\{P_{i\to a}\}) + \sum_i \log Z_i(\{Q_{a\to i}\}) - \sum_{(ai)} \log Z_{ai}(P_{i\to a}, Q_{a\to i}) \qquad [14]$$

where the 'shifts' $Z(\cdots)$ are defined through moments of order $r$
of the $z$'s, and sums run over vertices not in $B$. For instance $Z_{ai}$ is
the expectation of $z_{ai}(\eta, \hat\nu)^r$ when $\eta$, $\hat\nu$ are independent random
variables with distribution (respectively) $P_{i\to a}$ and $Q_{a\to i}$. The (Shannon)
entropy of the distribution $\tilde\mu$ is given by $\Sigma(r) = \Phi(r) - r\Phi'(r)$.
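
The identity between the Shannon entropy of $\tilde\mu$ and $\Phi(r) - r\Phi'(r)$ follows from $\Phi'(r) = \sum_{x_B} \tilde\mu(x_B) \log Z(x_B)$, and it can be checked numerically on any finite list of positive values $Z(x_B)$. The sketch below does this check; the list `z_values` is an arbitrary made-up example and the function names are hypothetical.

```python
import math


def phi(z_values, r):
    """Phi(r) = log Z(r), with Z(r) the sum of Z(x_B)^r over boundaries."""
    return math.log(sum(z ** r for z in z_values))


def phi_prime(z_values, r):
    """Phi'(r): the average of log Z(x_B) under the reweighted measure."""
    total = sum(z ** r for z in z_values)
    return sum(z ** r * math.log(z) for z in z_values) / total


def shannon_entropy(z_values, r):
    """Entropy of the measure that gives boundary x_B weight Z(x_B)^r / Z(r)."""
    total = sum(z ** r for z in z_values)
    probs = [z ** r / total for z in z_values]
    return -sum(p * math.log(p) for p in probs)


# Hypothetical solution counts Z(x_B), one entry per boundary condition.
z_values = [1, 1, 2, 2, 2, 5]

if __name__ == "__main__":
    for r in (0.0, 0.5, 1.0, 2.0):
        lhs = shannon_entropy(z_values, r)
        rhs = phi(z_values, r) - r * phi_prime(z_values, r)
        print(r, lhs, rhs)
```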

As mentioned, the above derivation holds for tree factor graphs. 
Nevertheless, the local recursion equations [10], [13] can be used
as a heuristic on loopy factor graphs as well. Further, although we
justified Eq. [13] through the introduction of a random boundary
condition $x_B$, we can take $B = \emptyset$ and still look for non-degenerate
solutions of such equations.

Starting from an arbitrary initialization of the messages, the re- 
cursions are iterated until an approximate fixed point is reached. After 
convergence, the distributions $P_{i\to a}$, $Q_{a\to i}$ can be used to evaluate the
potential $\Phi(r)$, cf. Eq. [14]. From this we compute the complexity
function $\Sigma(r) = \Phi(r) - r\Phi'(r)$, which gives access to the
decomposition of $\mu(\cdot)$ in pure states. More precisely, $\Sigma(r)$ is the exponential
growth rate of the number of states with internal entropy $s = \Phi'(r)$.
This is how curves such as those in Fig. 4 are traced.
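
The parametric construction of such a curve can be sketched as follows: for each value of $r$, the derivative $\Phi'(r)$ gives the abscissa $s$ and $\Phi(r) - r\Phi'(r)$ gives the ordinate $\Sigma$. The code below estimates the derivative by a central finite difference, which is one possible choice and not necessarily the one used for the figure. The potential `toy_phi` is a made-up quadratic used only to exercise the function, and all names are hypothetical.

```python
def complexity_curve(phi, r_values, h=1e-4):
    """Return a list of (s, Sigma) points, one for each value of r."""
    curve = []
    for r in r_values:
        # Central finite difference approximating Phi'(r).
        s = (phi(r + h) - phi(r - h)) / (2.0 * h)
        # Legendre transform: Sigma = Phi(r) - r * Phi'(r).
        curve.append((s, phi(r) - r * s))
    return curve


def toy_phi(r):
    """Made-up potential Phi(r) = 0.1 + 0.3 r + 0.02 r^2 (illustration only)."""
    return 0.1 + 0.3 * r + 0.02 * r * r


points = complexity_curve(toy_phi, [0.0, 0.5, 1.0, 1.5])

if __name__ == "__main__":
    for s, sigma in points:
        print(s, sigma)
```

For this toy potential the points lie on the inverted parabola $\Sigma = 0.1 - 12.5\,(s - 0.3)^2$, which is what one obtains by eliminating $r$ between $s = \Phi'(r)$ and $\Sigma = \Phi(r) - r\Phi'(r)$.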

In practice it can be convenient to consider the distributions of
messages $P_{i\to a}$, $Q_{a\to i}$ with respect to the graph realization. This
approach is sometimes referred to as 'density evolution' in coding
theory. If one considers a uniformly random directed edge $i \to a$
(or $a \to i$) in a rCSP instance, the corresponding message will be
a random variable. After $t$ parallel updates according to Eq. [13],
the message distribution converges (in the $N \to \infty$ limit) to a
well-defined law $\mathcal{P}_t$ (for variable-to-constraint messages) or $\mathcal{Q}_t$ (for
constraint-to-variable messages). As $t \to \infty$, these converge to a fixed point $\mathcal{P}$, $\mathcal{Q}$
that satisfies the distributional equivalent of Eq. [13].

To be definite, let us consider the case of graph coloring. Since
the compatibility functions are pairwise in this case (i.e. $k = 2$ in
Eq. [1]), the constraint-to-variable messages can be eliminated and
Eq. [13] takes the form

$$P_{i\to j}(\eta) \propto \int \prod_{l\in\partial i\setminus j} dP_{l\to i}(\eta_l)\; \delta\big[\eta - f(\{\eta_l\})\big]\; z(\{\eta_l\})^r ,$$

where $f$ is defined by $\eta(x) = z^{-1} \prod_l \big(1 - \eta_l(x)\big)$ and $z$ by
normalization. The distribution of $P_{i\to j}$ is then assumed to satisfy a
distributional version of the last equation. In the special case of random
regular graphs, a solution is obtained by assuming that $P_{i\to j}$ is indeed
independent of the graph realization and of $i, j$. One has therefore
simply to set $P_{i\to j} = P$ in the above and solve it for $P$.
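
For reference, the map $f$ and the normalization $z$ of the coloring update can be written as a short function. This is a minimal sketch under the assumption that each message is stored as a list of $q$ probabilities; the name `coloring_update` is hypothetical.

```python
def coloring_update(incoming):
    """Combine incoming messages eta_l into the outgoing pair (eta, z).

    Each incoming message is a list of q numbers, eta_l[x] being the
    probability that neighbour l takes colour x. The outgoing message is
    eta[x] = prod_l (1 - eta_l[x]) / z, with z fixed by normalization.
    """
    q = len(incoming[0])
    unnormalized = []
    for x in range(q):
        weight = 1.0
        for eta_l in incoming:
            weight *= 1.0 - eta_l[x]
        unnormalized.append(weight)
    z = sum(unnormalized)
    if z == 0.0:
        # Every colour is forbidden by some neighbour: no consistent output.
        raise ValueError("contradictory incoming messages")
    return [w / z for w in unnormalized], z


if __name__ == "__main__":
    # Two neighbours frozen on colours 0 and 1 force colour 2.
    print(coloring_update([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))
```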

In general, finding message distributions $\mathcal{P}$, $\mathcal{Q}$ that satisfy the
distributional version of Eq. [13] is an extremely challenging task,
even numerically. We adopted the population dynamics method [32],
which consists in representing the distributions by samples (this is
closely related to particle filters in statistics). For instance, one
represents $\mathcal{P}$ by a sample of $P$'s, each encoded as a list of $\eta$'s. Since
computer memory drastically limits the sample size, and thus the
precision of the results, we worked in two directions: (1) We analytically
solved the distributional equations for large $k$ (in the case of $k$-SAT) or
$q$ ($q$-coloring); (2) We identified and exploited simplifications arising
for special values of $r$.
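
To make the sampling idea concrete, here is a minimal sketch of population dynamics in the simplest setting mentioned above, $q$-coloring of random regular graphs, where a single distribution $P$ is represented by a population of messages $\eta$. It is not the authors' implementation: the reweighting by $z^r$ is done here by weighted resampling, which is one common choice, and the names `population_step` and `run` as well as all default parameter values are assumptions made for illustration.

```python
import random


def population_step(population, degree, r, rng):
    """One sweep: build candidate messages and resample them with weight z^r."""
    q = len(population[0])
    size = len(population)
    candidates, weights = [], []
    while len(candidates) < size:
        # Draw the degree - 1 incoming messages independently from P.
        incoming = rng.choices(population, k=degree - 1)
        unnormalized = [1.0] * q
        for eta_l in incoming:
            for x in range(q):
                unnormalized[x] *= 1.0 - eta_l[x]
        z = sum(unnormalized)
        if z > 0.0:  # discard contradictory combinations
            candidates.append([w / z for w in unnormalized])
            weights.append(z ** r)
    # Weighted resampling implements the factor z^r in the recursion.
    return rng.choices(candidates, weights=weights, k=size)


def run(q=3, degree=4, r=0.5, size=200, sweeps=20, seed=0):
    """Iterate the recursion from a random initial population."""
    rng = random.Random(seed)
    population = []
    for _ in range(size):
        raw = [rng.random() for _ in range(q)]
        total = sum(raw)
        population.append([v / total for v in raw])
    for _ in range(sweeps):
        population = population_step(population, degree, r, rng)
    return population


if __name__ == "__main__":
    final = run()
    print(final[:3])
```

The general case described in the text needs a population of populations (a sample of $P$'s, each a list of $\eta$'s), which is why memory becomes the limiting factor.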

Let us briefly discuss point (2). Simplifications emerge for $r = 0$
and $r = 1$. The first case corresponds to SP: Refs. [6, 11] showed
how to compute $\Sigma(r = 0)$ efficiently through population dynamics.
Building on this, we could show that the clusters' internal entropy
$s(r = 0)$ can be computed at a small supplementary cost (see [31]).

The value $r = 1$ corresponds instead to the 'tree reconstruction'
problem [35]: in this case $\tilde\mu(x_B)$, cf. Eq. [12], coincides with the
marginal of $\mu$. Averaging Eq. [13] (and the analogous one for $Q_{a\to i}$)
one obtains the BP equations [8], [9], e.g. $\int dP_{i\to a}(\eta)\, \eta = \eta_{i\to a}$.
These remarks can be used to show that the constrained averages

$$\overline{P}(\eta, \overline{\eta}) = \int d\mathcal{P}[P]\; P(\eta)\; \delta\Big(\overline{\eta} - \int dP(\eta')\, \eta'\Big), \qquad [15]$$

and $\overline{Q}(\hat\nu, \overline{\hat\nu})$ (defined analogously) satisfy closed equations which are
much easier to solve numerically.

We are grateful to J. Kurchan and M. Mezard for stimulating discussions. 
This work has been partially supported by the European Commission under 
contracts EVERGROW and STIPCO. 